42 research outputs found

    Statistical analysis tools for metabolic and genomic bacterial data

    This thesis introduces statistical analysis methods for two types of bacterial data: metabolic data produced by phenotype microarray technology, and genomic data produced by sequencing technologies. Both technologies produce vast amounts of data and have special features of their own, so there is a need for bioinformatics tools that adequately process and analyze the information produced. As in all biomolecular data analyses, the interplay between biological components poses an additional challenge to method development. A specific complication with the metabolic data is the small number of replicates, which is due to the high cost of performing the experiments. For the sequence data, genome-wide analysis tools are desired, since such methods have not yet been widely developed for bacteria, even though they exist for eukaryotic genetics. The thesis briefly reviews the current methods and introduces new approaches that tackle the above-mentioned problems.
    This thesis develops new statistical analysis methods for phenotype microarray data and gene sequence data; the former describes the metabolic activity of cells, and the latter reveals the genetic code of the cell. Statistical methods are needed when the information produced by these measurement techniques is to be used, for example, for medical purposes such as developing new treatments. Modern molecular-level measurement instruments characteristically produce a large number of observations. In addition, each method has its own special features that must be taken into account when interpreting the data. For example, when analyzing phenotype microarray data, the multidimensional nature of the data must be considered: a single experiment can examine thousands of phenotypes over time. The development of statistical methods and reliable statistical testing is further hampered by small numbers of replicates and the limited availability of data, which in turn results from phenotype microarray technology still being a relatively unknown, little-used method that is perceived as difficult to interpret. When analyzing gene sequences, in turn, the special features of the organism under study must be considered, since different organisms differ in their genetic properties. In humans, the association between genetic properties and many diseases, such as cancers, has been studied with, for example, genome-wide association analysis methods. In this thesis we present a genome-wide method developed for analyzing bacterial gene sequences, which can be used, for example, to map genetic factors that affect bacterial antibiotic resistance.

    Identifying Multiple Potential Metabolic Cycles in Time-Series from Biolog Experiments

    Biolog Phenotype Microarray (PM) is a technology that allows simultaneous screening of the metabolic behaviour of bacteria under a large number of different conditions. Bacteria often undergo several cycles of metabolic activity during a Biolog experiment. We introduce a novel algorithm to identify these metabolic cycles in PM experimental data, thus increasing the potential of PM technology in microbiology. Our method is based on a statistical decomposition of the time-series measurements into a set of growth models. We show that the method is robust to measurement noise and accurately captures the biologically relevant signals in the data. Our implementation is freely available as part of an R package for PM data analysis and can be found at www.helsinki.fi/bsg/software/Biolog_Decomposition. Peer reviewed

    Pk-yritysten ekokilpailukyky (Eco-competitiveness of SMEs)

    Not peer reviewed

    Novel R pipeline for analyzing Biolog Phenotypic MicroArray data.

    Data produced by Biolog Phenotype MicroArrays are longitudinal measurements of cells' respiration on distinct substrates. We introduce a three-step pipeline to analyze phenotypic microarray data with novel procedures for grouping, normalization and effect identification. Grouping and normalization are standard problems in the analysis of phenotype microarrays, defined as categorizing bacterial responses into active and non-active, and removing systematic errors from the experimental data, respectively. We expand existing solutions by introducing an important assumption: active and non-active bacteria manifest completely different metabolism and thus should be treated separately. Effect identification, in turn, provides new insights into detecting differing respiration patterns between experimental conditions, e.g. between different combinations of strains and temperatures, because not only the main effects but also their interactions can be evaluated. In the effect identification, the multilevel data are processed by a hierarchical model in the Bayesian framework. The pipeline is tested on a data set of 12 phenotypic plates with the bacterium Yersinia enterocolitica. Our pipeline is implemented in the R language on top of the opm R package and is freely available for research purposes.

    Convergent amino acid signatures in polyphyletic Campylobacter jejuni subpopulations suggest human niche tropism

    Human infection with the gastrointestinal pathogen Campylobacter jejuni is dependent upon the opportunity for zoonotic transmission and the ability of strains to colonize the human host. Certain lineages of this diverse organism are more common in human infection, but the factors underlying this overrepresentation are not fully understood. We analyzed 601 isolate genomes from agricultural animals and human clinical cases, including isolates from the multihost (ecological generalist) ST-21 and ST-45 clonal complexes (CCs). Combined nucleotide and amino acid sequence analysis identified 12 human-only amino acid KPAX clusters among polyphyletic lineages within the common disease-causing CC21 group isolates, with no such clusters among CC45 isolates. Isolate sequence types within human-only CC21 group KPAX clusters have been sampled from other hosts, including poultry, so rather than representing unsampled reservoir hosts, the increase in relative frequency in human infection potentially reflects a genetic bottleneck at the point of human infection. Consistent with this, sequence enrichment analysis identified nucleotide variation in genes with putative functions related to human colonization and pathogenesis in human-only clusters. Furthermore, the tight clustering and polyphyly of human-only lineage clusters within a single CC suggest the repeated evolution of human association through acquisition of genetic elements within this complex. Taken together, combined nucleotide and amino acid analysis of large isolate collections may provide clues about human niche tropism and the nature of the forces that promote the emergence of clinically important C. jejuni lineages. Peer reviewed

    Combined Analysis of Variation in Core, Accessory and Regulatory Genome Regions Provides a Super-Resolution View into the Evolution of Bacterial Populations

    The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high-resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements, could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first-ever population-level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements. Peer reviewed

    Genomic signatures of human and animal disease in the zoonotic pathogen Streptococcus suis

    Streptococcus suis causes disease in pigs worldwide and is increasingly implicated in zoonotic disease in East and South-East Asia. To understand the genetic basis of disease in S. suis, we study the genomes of 375 isolates with detailed clinical phenotypes from pigs and humans from the United Kingdom and Vietnam. Here, we show that isolates associated with disease contain substantially fewer genes than non-clinical isolates, but are more likely to encode virulence factors. Human disease isolates are limited to a single virulent population, originating in the 1920s when pig production was intensified, but no consistent genomic differences between pig and human isolates are observed. There is little geographical clustering of different S. suis subpopulations, and the bacterium undergoes high rates of recombination, implying that an increase in virulence anywhere in the world could have a global impact over a short timescale. Peer reviewed